About the data

The data were downloaded from Albemarle.org.¹ All analyses were performed on 2024-04-03 in the R statistical computing environment, version 4.3.3 (R Core Team 2024).

Below is a summary of the data, courtesy of the {modelsummary} package (Arel-Bundock 2022).

modelsummary::datasummary_skim(homes, type = "numeric")
| | Unique | Missing Pct. | Mean | SD | Min | Median | Max |
|---|---|---|---|---|---|---|---|
| lotsize | 7755 | 0 | 4.6 | 21.3 | 0.0 | 0.6 | 1067.2 |
| totalvalue | 10750 | 0 | 582692.9 | 495109.1 | 7600.0 | 464900.0 | 16272400.0 |
| lastsaleprice | 6416 | 1 | 297446.8 | 656670.5 | 0.0 | 220000.0 | 58000000.0 |
| yearbuilt | 229 | 0 | 1984.1 | 37.7 | 1668.0 | 1991.0 | 2024.0 |
| yearremodeled | 78 | 91 | 2004.8 | 14.9 | 1901.0 | 2007.0 | 2023.0 |
| finsqft | 4184 | 0 | 2114.1 | 983.3 | 144.0 | 1924.0 | 19776.0 |
| bedroom | 13 | 0 | 3.4 | 0.9 | 0.0 | 3.0 | 18.0 |
| fullbath | 14 | 0 | 2.4 | 1.0 | 0.0 | 2.0 | 14.0 |
| age | 229 | 0 | 39.9 | 37.7 | 0.0 | 33.0 | 356.0 |
| remodeled | 2 | 0 | 0.1 | 0.3 | 0.0 | 0.0 | 1.0 |

Home Prices

Median total value and lot size by high school district.

stats <- aggregate(cbind(totalvalue, lotsize) ~ hsdistrict, 
                   data = homes, median)
knitr::kable(stats, col.names = c("High School District", 
                                  "Median Home Value", 
                                  "Median Lot Size"), 
             align = "lcc")
| High School District | Median Home Value | Median Lot Size |
|:---|:---:|:---:|
| Albemarle | 429800 | 0.2788 |
| Monticello | 446300 | 1.0330 |
| Western Albemarle | 574800 | 1.3990 |

The current real estate tax rate in Albemarle County is $0.854 per $100 of assessed value.² For example, a home assessed at $400,000 owes the following:

\[ \frac{\$400,000}{100} \times \$0.854 = \$3,416 \]
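
The same arithmetic can be wrapped in a small helper. This is a minimal sketch: the function name `tax_owed` and its argument names are illustrative and not part of the original analysis; the default rate is the $0.854 per $100 quoted above.

```r
# Hypothetical helper (not from the original analysis): real estate tax owed
# for a given assessed value, with the rate expressed per $100 of value.
tax_owed <- function(assessed_value, rate_per_100 = 0.854) {
  assessed_value / 100 * rate_per_100
}

tax_owed(400000)                      # 3416, matching the worked example
tax_owed(c(429800, 446300, 574800))   # vectorized over several assessed values
```

Because the function is vectorized, it can be applied directly to a whole column of assessed values, such as the district medians in the table above.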

Visualize Data

Plot of total value versus finished square feet using ggplot2 (Wickham 2016).

library(ggplot2)  # provides ggplot(), aes(), and the geoms and scales below

ggplot(homes) +
  aes(x = finsqft, y = totalvalue) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10() +
  facet_wrap(~hsdistrict) +
  labs(caption = "Note: axes on log10 scale.")

Over 18% of homes in the Western Albemarle school district are assessed at over $1,000,000.
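
A share like this can be computed by averaging a logical indicator within each district. The sketch below uses a tiny made-up data frame (`homes_demo`, with hypothetical values) so that it runs on its own; it assumes only the `totalvalue` and `hsdistrict` columns used elsewhere in this report.

```r
# Made-up illustration data; NOT the Albemarle County records.
homes_demo <- data.frame(
  hsdistrict = c("Albemarle", "Albemarle", "Monticello", "Monticello",
                 "Western Albemarle", "Western Albemarle",
                 "Western Albemarle", "Western Albemarle"),
  totalvalue = c(350000, 1200000, 410000, 520000,
                 1500000, 640000, 980000, 720000)
)

# The mean of a logical vector is the proportion of TRUE values, so this
# gives the share of homes assessed above $1,000,000 in each district.
share_over_1m <- tapply(homes_demo$totalvalue > 1e6,
                        homes_demo$hsdistrict, mean)

round(100 * share_over_1m, 1)  # percentages by district
```

Run against the real `homes` data, the same `tapply()` call would return one proportion per high school district.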

Model total value

We fit two models with finished square feet and high school district as predictors. The second model also includes an interaction between the two.

| | model 1 | model 2 |
|---|---|---|
| (Intercept) | 4.571 | 5.647 |
| | (0.034) | (0.061) |
| log(finsqft) | 1.122 | 0.979 |
| | (0.004) | (0.008) |
| Monticello | -0.008 | -1.467 |
| | (0.004) | (0.082) |
| Western Albemarle | 0.103 | -1.524 |
| | (0.005) | (0.085) |
| log(finsqft):Monticello | | 0.194 |
| | | (0.011) |
| log(finsqft):Western Albemarle | | 0.215 |
| | | (0.011) |
| Num.Obs. | 32502 | 32502 |
| R2 | 0.675 | 0.679 |
| R2 Adj. | 0.675 | 0.679 |
| AIC | 871946.3 | 871508.5 |
| BIC | 871988.3 | 871567.3 |
| Log.Lik. | -10699.623 | -10478.719 |
| F | 22464.295 | 13751.153 |
| RMSE | 0.34 | 0.33 |
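
The coefficient names in the table (a `log(finsqft)` term, district indicators with Albemarle as the reference level, and `log(finsqft)` by district interactions) suggest specifications like the ones sketched below. The fitting code is not shown in this report, so treat this as an assumption: in particular, the log-transformed response `log(totalvalue)` is inferred from the log-scale plot and the size of the RMSE, and the data frame `homes_sim` is simulated purely so the example runs on its own.

```r
set.seed(1)

# Simulated stand-in data; NOT the Albemarle County records.
n <- 300
homes_sim <- data.frame(
  finsqft    = round(exp(rnorm(n, mean = 7.5, sd = 0.4))),
  hsdistrict = factor(sample(c("Albemarle", "Monticello", "Western Albemarle"),
                             n, replace = TRUE))
)
homes_sim$totalvalue <- exp(5 + 1.1 * log(homes_sim$finsqft) +
                              rnorm(n, sd = 0.3))

# Model 1: additive effects of log square footage and district.
m1 <- lm(log(totalvalue) ~ log(finsqft) + hsdistrict, data = homes_sim)

# Model 2: adds an interaction, letting the slope for log(finsqft)
# differ by district.
m2 <- lm(log(totalvalue) ~ log(finsqft) * hsdistrict, data = homes_sim)

# Nested-model F test for the interaction terms.
anova(m1, m2)
```

If the response is indeed logged, the `log(finsqft)` coefficient reads as an elasticity: in model 1, a 1% increase in finished square feet is associated with roughly a 1.12% increase in total value, holding district fixed.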

Visualize Model

Visualize the model using the ggeffects package (Lüdecke 2018).
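
An effect plot of this kind shows model predictions over a grid of one predictor, with a separate line for each group. The sketch below builds such predictions with base R `predict()`, used here as a stand-in and not the ggeffects code behind the report's figure. The data frame `d` and the model `m_sim` are simulated and hypothetical, and the logged response is an assumption.

```r
set.seed(2)

# Simulated stand-in data and model; NOT the report's data or fitted model.
n <- 300
d <- data.frame(
  finsqft    = round(exp(rnorm(n, mean = 7.5, sd = 0.4))),
  hsdistrict = factor(sample(c("Albemarle", "Monticello", "Western Albemarle"),
                             n, replace = TRUE))
)
d$totalvalue <- exp(5 + 1.1 * log(d$finsqft) + rnorm(n, sd = 0.3))
m_sim <- lm(log(totalvalue) ~ log(finsqft) * hsdistrict, data = d)

# Prediction grid: a range of square footage crossed with every district.
grid <- expand.grid(finsqft = seq(1000, 4000, by = 250),
                    hsdistrict = levels(d$hsdistrict))

# Predictions are on the log scale; exponentiate to back-transform to dollars.
grid$predicted <- exp(predict(m_sim, newdata = grid))

# Empty plot frame, then one fitted line per district.
plot(predicted ~ finsqft, data = grid, type = "n",
     xlab = "Finished square feet", ylab = "Predicted total value")
districts <- levels(grid$hsdistrict)
for (i in seq_along(districts)) {
  lines(predicted ~ finsqft,
        data = grid[grid$hsdistrict == districts[i], ], col = i)
}
legend("topleft", legend = districts, col = seq_along(districts), lty = 1)
```

With an interaction in the model, the lines are free to have different slopes, which is what distinguishes model 2 from model 1.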

Ages of Homes by HS district

Albemarle

hist(homes$age[homes$hsdistrict == "Albemarle"], main = "", xlab = "Age")

Monticello

hist(homes$age[homes$hsdistrict == "Monticello"], main = "", xlab = "Age")

Western Albemarle

hist(homes$age[homes$hsdistrict == "Western Albemarle"], main = "", xlab = "Age")

References

Arel-Bundock, Vincent. 2022. “modelsummary: Data and Model Summaries in R.” Journal of Statistical Software 103 (1). https://doi.org/10.18637/jss.v103.i01.
Lüdecke, Daniel. 2018. “ggeffects: Tidy Data Frames of Marginal Effects from Regression Models.” Journal of Open Source Software 3 (26): 772. https://doi.org/10.21105/joss.00772.
R Core Team. 2024. “R: A Language and Environment for Statistical Computing.” https://www.R-project.org/.
Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. https://ggplot2.tidyverse.org.

  1. Collected in February 2024.

  2. As of 2023.